An Ngram-based reordering model
Abstract
This paper describes in detail a novel approach to the reordering challenge in statistical machine translation (SMT). This Ngram-based reordering (NbR) approach uses the techniques of SMT systems themselves to generate a weighted reordering graph. Reordering constraints based on statistical criteria are thus supplied to the SMT system, which allows the decoding search to be extended. The NbR approach can generalize reorderings learned during training by using word classes instead of the words themselves. Improved translation performance is demonstrated on the EPPS task (Spanish and German to English) and the BTEC task (Arabic to English). © 2008 Elsevier Ltd. All rights reserved.
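As a rough illustration of how word classes let learned reorderings carry over to unseen words, the toy sketch below counts permutations per class sequence and turns them into weighted reordering hypotheses. It is not the NbR system described in the abstract: the word-to-class map, the example sentences, the relative-frequency weights, and the names `WORD_CLASS`, `to_classes`, `train`, and `reorder_hypotheses` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical word-to-class map (POS-like tags); an assumption, not from the paper.
WORD_CLASS = {
    "casa": "NOUN", "coche": "NOUN", "programa": "NOUN",
    "blanca": "ADJ", "rojo": "ADJ", "europeo": "ADJ",
    "la": "DET", "el": "DET",
}

def to_classes(words):
    """Map each word to its class; unknown words fall back to 'UNK'."""
    return tuple(WORD_CLASS.get(w, "UNK") for w in words)

def train(examples):
    """Count the permutations observed for each source class sequence.

    Each example is (source_words, permutation), where permutation[i] is the
    index of the source word placed at position i of the reordered source.
    """
    counts = defaultdict(Counter)
    for words, perm in examples:
        counts[to_classes(words)][tuple(perm)] += 1
    return counts

def reorder_hypotheses(counts, words):
    """Return (reordered_words, weight) pairs, weighted by relative frequency."""
    seen = counts.get(to_classes(words))
    if not seen:
        # No pattern learned for this class sequence: keep the monotone order.
        return [(list(words), 1.0)]
    total = sum(seen.values())
    return [([words[i] for i in perm], n / total)
            for perm, n in seen.most_common()]

# Toy training data: Spanish noun-adjective pairs, mostly swapped.
examples = [
    (["la", "casa", "blanca"], [0, 2, 1]),
    (["el", "coche", "rojo"], [0, 2, 1]),
    (["el", "coche", "rojo"], [0, 1, 2]),
]
counts = train(examples)

# "programa" and "europeo" never occur in training, but their classes do.
hyps = reorder_hypotheses(counts, ["el", "programa", "europeo"])
for reordered, weight in hyps:
    print(reordered, round(weight, 3))
```

Because the patterns are keyed on class sequences rather than on the words, the swap learned from "casa blanca" and "coche rojo" also applies to "programa europeo". A real system would score such hypotheses inside the decoder rather than by plain relative frequency.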
Similar resources
Using Linear Interpolation and Weighted Reordering Hypotheses in the Moses System
This paper proposes to introduce a novel reordering model in the open-source Moses toolkit. The main idea is to provide weighted reordering hypotheses to the SMT decoder. These hypotheses are built using a first-step Ngram-based SMT translation from a source language into a third representation that is called reordered source language. Each hypothesis has its own weight provided by the Ngram-ba...
Reordered Search and Tuple Unfolding for Ngram-based SMT
In Statistical Machine Translation, the use of reordering for certain language pairs can produce a significant improvement on translation accuracy. However, the search problem is shown to be NP-hard when arbitrary reorderings are allowed. This paper addresses the question of reordering for an Ngram-based SMT approach following two complementary strategies, namely reordered search and tuple unfo...
Ngram-Based Statistical Machine Translation Enhanced with Multiple Weighted Reordering Hypotheses
This paper describes the 2007 Ngram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the previous year's system, which are highlighted and empirically compared. Mainly, these include a novel word ordering strategy based on: (1) statistically monotonizing...
An Ngram-based Statistical Machi
In this paper we describe MARIE, an Ngram-based statistical machine translation decoder. It is implemented using a beam search strategy, with distortion (or reordering) capabilities. The underlying translation model is based on an Ngram approach, extended to introduce reordering at the phrase level. The search graph structure is designed to perform very accurate comparisons, which allows for a h...
The Operation Sequence Model - Combining N-Gram-Based and Phrase-Based Statistical Machine Translation
In this article, we present a novel machine translation model, the Operation Sequence Model (OSM), that combines the benefits of phrase-based and N-gram-based SMT and remedies their drawbacks. The model represents the translation process as a linear sequence of operations. The sequence includes not only translation operations but also reordering operations. As in Ngram-based SMT, the model is: ...
Journal:
- Computer Speech & Language
Volume: 23
Publication date: 2009